Split pragmatics into presuppositions and scalar implicatures #2938

Merged

raileymontalan
Contributor

No description provided.

@raileymontalan raileymontalan marked this pull request as draft August 16, 2024 09:14
@raileymontalan raileymontalan marked this pull request as ready for review September 6, 2024 14:44
@raileymontalan
Contributor Author

Hi @weiqipedia, for your info.

Collaborator

@yifanmai yifanmai left a comment

Looks good overall. Note that you have to change schema_bhasa.yaml to reflect changes (but that can be done in a separate pull request).

)
# Split "True or False" into ["True", "or", "False"]
choices = row["choices"].split()
choices_translated = row["choices_translated"].split()
Collaborator

Does this work consistently across every (supported) language?

Contributor

That's a good question! For now we only have Indonesian (and Tamil), and splitting on whitespace and taking the first and third elements of the list works for both languages. Just FYI, this will not work for Thai, which does not put spaces between words; there we would need something closer to your suggestion of splitting on " or " (but we will not be adding Thai any time soon).
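
For illustration, here is a minimal sketch contrasting the two approaches discussed in this thread. The function names are hypothetical and are not taken from `bhasa_scenario.py`; the separator passed in is an assumption that would have to be chosen per language.

```python
from typing import Tuple


def choices_by_whitespace(choices: str) -> Tuple[str, str]:
    """Split on whitespace and keep the first and third tokens.

    Only valid when each choice is a single space-delimited word.
    """
    tokens = choices.split()
    if len(tokens) != 3:
        raise ValueError(f"Expected exactly three tokens, got {tokens!r}")
    return tokens[0], tokens[2]


def choices_by_separator(choices: str, separator: str) -> Tuple[str, str]:
    """Split on an explicit connective (for example " or ").

    Tolerates multi-word choices, and the separator itself does not
    need to contain spaces.
    """
    parts = [part.strip() for part in choices.split(separator)]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Expected two choices around {separator!r}, got {parts!r}")
    return parts[0], parts[1]


print(choices_by_whitespace("True or False"))
print(choices_by_separator("Not true or False", " or "))
```

The whitespace version fails loudly on a multi-word choice such as "Not true or False", while the separator version handles it, at the cost of needing to know the connective for each language.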

        if self.language not in self.prompts.keys():
            raise (Exception(f"Unsupported language {self.language} - supported languages are {self.prompts.keys()}"))
        else:
            self.prompt_components = self.prompts[self.language]

    def download_dataset(self, output_path: str):
        BASE_URL = "https://raw.githubusercontent.com/aisingapore/BHASA/main/lindsea/"
Collaborator

Optional: You can pin this to a specific commit hash so that future changes to the repository won't cause this scenario to change, e.g.

BASE_URL = "https://raw.githubusercontent.com/aisingapore/BHASA/10e34008e8142bef400cf8ffab15b2b6aaf3aa7f/lindsea/"

Collaborator

@yifanmai yifanmai left a comment

LGTM, thanks!

dataset = pd.read_json(target_path_file, lines=True)
datasets = []
for subset in self.subsets:
    URL = f"{BASE_URL}{self.language}/pragmatics/pragmatic_reasoning_{subset}.jsonl"
Collaborator

nit: URL should be lowercase (it is not a constant)
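
A minimal sketch of what this nit amounts to, assuming the URL pattern from the diff above and the commit hash suggested earlier in the review. The function name `build_subset_urls`, the language code, and the subset names in the usage line are hypothetical, and whether those files exist at that commit has not been checked here.

```python
from typing import List

# Module-level constants: upper case is the usual PEP 8 convention here.
PINNED_COMMIT = "10e34008e8142bef400cf8ffab15b2b6aaf3aa7f"
BASE_URL = f"https://raw.githubusercontent.com/aisingapore/BHASA/{PINNED_COMMIT}/lindsea/"


def build_subset_urls(language: str, subsets: List[str]) -> List[str]:
    """Return one download URL per subset for the given language."""
    urls = []
    for subset in subsets:
        # Rebound on every iteration, so it is an ordinary lowercase local.
        url = f"{BASE_URL}{language}/pragmatics/pragmatic_reasoning_{subset}.jsonl"
        urls.append(url)
    return urls


# Hypothetical language code and subset names, for illustration only.
for url in build_subset_urls("id", ["subset_a", "subset_b"]):
    print(url)
```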

@yifanmai yifanmai merged commit 64f23d3 into stanford-crfm:main Oct 1, 2024
6 checks passed
3 participants